daemonset: differentiate between cases in nodeShouldRun #38787
Conversation
cc @davidopp
@kubernetes/daemonset @kubernetes/sig-apps please review.
Much cleaner. I'd ask for a slightly better test for the new method, but if it's covered sufficiently in your mind I'm OK with it as is.
// * shouldContinueRunning:
//     Returns true when a daemonset should continue running on a node if a daemonset pod is already
//     running on that node.
func (dsc *DaemonSetsController) nodeShouldRunDaemonPod(node *v1.Node, ds *extensions.DaemonSet) (wantToRun, shouldSchedule, shouldContinueRunning bool, err error) {
Can I ask you to do a table test for this method that shows the outcomes?
Will do.
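For reference, a table-driven test could look roughly like this (a minimal sketch, not the actual test added in this PR; newTestController, newNode, newOutOfDiskNode, and newDaemonSet are hypothetical fixture helpers):

func TestNodeShouldRunDaemonPod(t *testing.T) {
    cases := []struct {
        node                                             *v1.Node
        ds                                               *extensions.DaemonSet
        wantToRun, shouldSchedule, shouldContinueRunning bool
    }{
        // One entry per outcome, e.g. a node-name mismatch blocks everything,
        // while an out-of-disk condition only blocks new scheduling:
        // {newNode("other-node"), newDaemonSet("node-1"), false, false, false},
        // {newOutOfDiskNode("node-1"), newDaemonSet("node-1"), true, false, true},
    }
    for i, c := range cases {
        dsc := newTestController() // hypothetical helper wiring fake clients
        wantToRun, shouldSchedule, shouldContinueRunning, err := dsc.nodeShouldRunDaemonPod(c.node, c.ds)
        if err != nil {
            t.Errorf("case %d: unexpected error: %v", i, err)
        }
        if wantToRun != c.wantToRun || shouldSchedule != c.shouldSchedule || shouldContinueRunning != c.shouldContinueRunning {
            t.Errorf("case %d: got (%t, %t, %t), want (%t, %t, %t)", i,
                wantToRun, shouldSchedule, shouldContinueRunning,
                c.wantToRun, c.shouldSchedule, c.shouldContinueRunning)
        }
    }
}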
  // If the daemon set specifies a node name, check that it matches with node.Name.
  if !(ds.Spec.Template.Spec.NodeName == "" || ds.Spec.Template.Spec.NodeName == node.Name) {
-     return false
+     wantToRun, shouldSchedule, shouldContinueRunning = false, false, false
+     return
Can you inline the returned values like you do below?
return false, false, false, nil
  }
  // TODO: Move it to the predicates
  for _, c := range node.Status.Conditions {
      if c.Type == v1.NodeOutOfDisk && c.Status == v1.ConditionTrue {
-         return false
+         // the kubelet will evict this pod if it needs to. Let kubelet
+         // decide wether to continue running this pod so leave shouldContinueRunning
s/wether/whether/
        predicates.ErrPodAffinityNotMatch,
        predicates.ErrServiceAffinityViolated,
        predicates.ErrTaintsTolerationsNotMatch:
        return false, false, false, fmt.Errorf("unexpected reason: GeneralPrdicates should not return reason %s", reason.GetReason())
typo
        predicates.ErrMaxVolumeCountExceeded,
        predicates.ErrNodeUnderMemoryPressure,
        predicates.ErrNodeUnderDiskPressure:
        shouldSchedule = false
Can you make it explicit that the other two booleans are true here?
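Something like this, for instance (a sketch of just this case; the surrounding switch over predicate failure reasons is elided):

case predicates.ErrMaxVolumeCountExceeded,
    predicates.ErrNodeUnderMemoryPressure,
    predicates.ErrNodeUnderDiskPressure:
    // the daemon pod is still wanted here and an already-running pod may
    // keep running; only scheduling of a new daemon pod is blocked.
    wantToRun, shouldSchedule, shouldContinueRunning = true, false, true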
  for i := range dsList.Items {
      ds := &dsList.Items[i]
-     shouldEnqueue := (dsc.nodeShouldRunDaemonPod(oldNode, ds) != dsc.nodeShouldRunDaemonPod(curNode, ds))
-     if shouldEnqueue {
+     _, oc1, oc2, err := dsc.nodeShouldRunDaemonPod(oldNode, ds)
It's not clear what the names mean. Maybe oldShouldSchedule, oldShouldContinueRunning? Same for the names below.
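i.e. something along these lines (a sketch of the node-update loop body; error handling abbreviated, and assuming the controller's existing enqueueDaemonSet helper):

_, oldShouldSchedule, oldShouldContinueRunning, err := dsc.nodeShouldRunDaemonPod(oldNode, ds)
if err != nil {
    continue
}
_, curShouldSchedule, curShouldContinueRunning, err := dsc.nodeShouldRunDaemonPod(curNode, ds)
if err != nil {
    continue
}
// enqueue only when the node change flips one of the decisions
if oldShouldSchedule != curShouldSchedule || oldShouldContinueRunning != curShouldContinueRunning {
    dsc.enqueueDaemonSet(ds)
}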
Force-pushed from b514bd1 to 9cc8c7c.
@Kargakis @smarterclayton addressed all comments.
        // this one is probably intentional since it's a workaround for not having
        // pod hard anti affinity.
        predicates.ErrPodNotFitsHostPorts:
        wantToRun, shouldSchedule, shouldContinueRunning = false, false, false
Is node affinity something that doesn't hold true for DaemonSets and works only for pods that are processed by the scheduler?
@Kargakis node affinity bubbles up here as a predicates.ErrNodeSelectorNotMatch. The DaemonSet obeys "required during scheduling" node affinities.
Yes, actually never mind my question. Required/ignored during execution is yet to be implemented.
Basically "required during execution" only is not implemented yet. Here you have implemented "required during execution" for daemon sets. My real question from the begining was "Is node affinity going to affect daemon sets? Will I be able to specify predicates in the pod template of a daemon set?"
Yes, the plan is to support node affinity in daemonsets. We will need to modify this method when "required during execution" is implemented.
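For illustration, the kind of hard node affinity that would surface here as predicates.ErrNodeSelectorNotMatch (a sketch using the v1 affinity types; the disktype key and ssd value are made-up examples):

affinity := &v1.Affinity{
    NodeAffinity: &v1.NodeAffinity{
        // "required during scheduling" is what the DaemonSet controller
        // honors; a "required during execution" variant is not implemented yet.
        RequiredDuringSchedulingIgnoredDuringExecution: &v1.NodeSelector{
            NodeSelectorTerms: []v1.NodeSelectorTerm{{
                MatchExpressions: []v1.NodeSelectorRequirement{{
                    Key:      "disktype",
                    Operator: v1.NodeSelectorOpIn,
                    Values:   []string{"ssd"},
                }},
            }},
        },
    },
}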
/lgtm
Automatic merge from submit-queue (batch tested with PRs 39483, 39088, 38787)
Specifically we need to differentiate between wanting to run, should run and should continue running. This is required to support all taint effects and will improve reporting and end user debuggability.
fixes #28839 among other things